AITopics | network-valued data

On clustering network-valued data

Neural Information Processing Systems | Nov-20-2025, 23:44:02 GMT

Community detection, which focuses on clustering nodes or detecting communities in (mostly) a single network, is a problem of considerable practical interest and has received a great deal of attention in the research community. While being able to cluster within a network is important, there are emerging needs to be able to cluster multiple networks. This is largely motivated by the routine collection of network data that are generated from potentially different populations. These networks may or may not have node correspondence. When node correspondence is present, we cluster networks by summarizing a network by its graphon estimate, whereas when node correspondence is not present, we propose a novel solution for clustering such networks by associating a computationally feasible feature vector to each network based on trace of powers of the adjacency matrix. We illustrate our methods using both simulated and real data sets, and theoretical justifications are provided in terms of consistency.
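To make the idea of a feature vector built from traces of powers of the adjacency matrix concrete, here is a minimal sketch in Python. The abstract does not give the exact normalization or the range of powers, so the scaling by the number of nodes, the choice of powers 2 to 5, and the names `trace_power_features` and `sample_er_graph` are all assumptions made for illustration, not the authors' definition.

```python
import numpy as np


def trace_power_features(adj, max_power=5):
    """Return [tr((A/n)^k) for k = 2..max_power] for a symmetric adjacency matrix.

    Assumption: scaling by n and the range of powers are illustrative choices.
    """
    n = adj.shape[0]
    # For a symmetric matrix M, tr(M^k) equals the sum of the k-th powers
    # of its eigenvalues, which avoids forming the matrix powers explicitly.
    eigvals = np.linalg.eigvalsh(adj / n)
    return np.array([np.sum(eigvals ** k) for k in range(2, max_power + 1)])


def sample_er_graph(n, p, rng):
    """Sample a simple undirected Erdos-Renyi graph (hypothetical test data)."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper + upper.T).astype(float)


rng = np.random.default_rng(0)
# Two populations with different sizes, so no node correspondence exists.
sparse = [sample_er_graph(60, 0.1, rng) for _ in range(3)]
dense = [sample_er_graph(80, 0.5, rng) for _ in range(3)]

features = np.array([trace_power_features(a) for a in sparse + dense])
# Pairwise Euclidean distances between feature vectors; any standard
# clustering routine could be run on `features` or on `dist`.
dist = np.linalg.norm(features[:, None] - features[None, :], axis=-1)
```

Because a trace is unchanged when nodes are relabeled, the features do not depend on node order, which is why they can be compared across networks of different sizes.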

name change, network-valued data, node correspondence, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.79)
Information Technology > Data Science > Data Mining (0.61)

On clustering network-valued data

Soumendu Sundar Mukherjee, Purnamrita Sarkar, Lizhen Lin

Neural Information Processing Systems | Nov-20-2025, 21:37:27 GMT

A network, which is used to model interactions or communications among a set of agents or nodes, is arguably one of the most common and important representations for modern complex data.

artificial intelligence, data mining, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

On clustering network-valued data

Soumendu Sundar Mukherjee, Purnamrita Sarkar, Lizhen Lin

Neural Information Processing Systems | Nov-20-2025, 08:10:53 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, data mining, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Barycentric subspace analysis of network-valued data

Maignant, Elodie, Pennec, Xavier, Trouvé, Alain, Calissano, Anna

arXiv.org Machine Learning | Aug-1-2025

Certain data are naturally modeled by networks or weighted graphs, be they arterial networks or mobility networks. When there is no canonical labeling of the nodes across the dataset, we talk about unlabeled networks. In this paper, we focus on the question of dimensionality reduction for this type of data. More specifically, we address the issue of interpreting the feature subspace constructed by dimensionality reduction methods. Most existing methods for network-valued data are derived from principal component analysis (PCA) and therefore rely on subspaces generated by a set of vectors, which we identify as a major limitation in terms of interpretability. Instead, we propose to implement the method called barycentric subspace analysis (BSA), which relies on subspaces generated by a set of points. In order to provide a computationally feasible framework for BSA, we introduce a novel embedding for unlabeled networks where we replace their usual representation by equivalence classes of isomorphic networks with that by equivalence classes of cospectral networks. We then illustrate BSA on simulated and real-world datasets, and compare it to tangent PCA.
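The replacement of isomorphism classes by cospectral classes can be illustrated with a small sketch. The abstract does not say which matrix the spectrum is taken from, so using the adjacency (weight) matrix here is an assumption, and the helper name `spectrum` is hypothetical. The example shows that relabeling nodes leaves the spectrum unchanged, and that two non-isomorphic graphs (the star on five nodes, and a 4-cycle plus an isolated node) share the same adjacency spectrum, so the cospectral classes are coarser than the isomorphism classes.

```python
import numpy as np


def spectrum(weights):
    """Sorted eigenvalues of a symmetric weight matrix (assumed embedding)."""
    return np.linalg.eigvalsh(weights)


# Star graph: node 0 joined to nodes 1..4.
star = np.zeros((5, 5))
star[0, 1:] = 1.0
star[1:, 0] = 1.0

# 4-cycle on nodes 0..3 plus an isolated node 4.
cycle_plus_point = np.zeros((5, 5))
for i in range(4):
    j = (i + 1) % 4
    cycle_plus_point[i, j] = cycle_plus_point[j, i] = 1.0

# Relabeling the nodes of the star gives an isomorphic network.
perm = np.array([3, 0, 4, 1, 2])
relabeled_star = star[np.ix_(perm, perm)]

same_after_relabeling = np.allclose(spectrum(star), spectrum(relabeled_star))
cospectral_pair = np.allclose(spectrum(star), spectrum(cycle_plus_point))
```

Both flags evaluate to true: each of the two graphs has adjacency eigenvalues -2, 0, 0, 0, 2, although their degree sequences differ.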

artificial intelligence, machine learning, subspace, (16 more...)

arXiv.org Machine Learning

2507.23559

Country:

Europe > Western Europe (0.04)
Europe > Switzerland (0.04)
Europe > Eastern Europe (0.04)
(8 more...)

Genre: Research Report (0.40)

Industry:

Transportation > Passenger (1.00)
Transportation > Air (1.00)
Consumer Products & Services > Travel (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Graphon based Clustering and Testing of Networks: Algorithms and Theory

Sabanayagam, Mahalakshmi, Vankadara, Leena Chennuru, Ghoshdastidar, Debarghya

arXiv.org Machine Learning | Oct-6-2021

Network-valued data are encountered in a wide range of applications, and pose challenges in learning due to their complex structure and absence of vertex correspondence. Typical examples of such problems include classification or grouping of protein structures and social networks. Various methods, ranging from graph kernels to graph neural networks, have been proposed that achieve some success in graph classification problems. However, most methods have limited theoretical justification, and their applicability beyond classification remains unexplored. In this work, we propose methods for clustering multiple graphs, without vertex correspondence, that are inspired by the recent literature on estimating graphons, which are symmetric functions corresponding to the infinite-vertex limit of graphs. We propose a novel graph distance based on sorting-and-smoothing graphon estimators. Using the proposed graph distance, we present two clustering algorithms and show that they achieve state-of-the-art results. We prove the statistical consistency of both algorithms under Lipschitz assumptions on the graph degrees. We further study the applicability of the proposed distance for graph two-sample testing problems. Machine learning on graphs has evolved considerably over the past two decades. The traditional view of network analysis is limited to modelling interactions among entities of interest, for instance social networks or the World Wide Web, and learning algorithms based on graph theory have been commonly used to solve these problems (Von Luxburg, 2007; Yan et al., 2006).
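A distance built on a sorting-and-smoothing graphon estimate can be sketched as follows. This is only an illustration under stated assumptions: nodes are sorted by degree, the sorted adjacency matrix is averaged over a fixed grid of blocks, and two graphs are compared through the Frobenius norm of the difference of their block averages. The function names, the number of blocks, and the normalization are hypothetical and are not taken from the paper.

```python
import numpy as np


def sort_and_smooth(adj, n_blocks=10):
    """Degree-sort the nodes, then average the adjacency matrix block-wise.

    Returns an n_blocks x n_blocks histogram, so graphs of different sizes
    map to arrays of the same shape. Requires at least n_blocks nodes.
    """
    order = np.argsort(-adj.sum(axis=1), kind="stable")
    sorted_adj = adj[np.ix_(order, order)]
    groups = np.array_split(np.arange(adj.shape[0]), n_blocks)
    hist = np.empty((n_blocks, n_blocks))
    for r, rows in enumerate(groups):
        for c, cols in enumerate(groups):
            hist[r, c] = sorted_adj[np.ix_(rows, cols)].mean()
    return hist


def graphon_distance(adj_a, adj_b, n_blocks=10):
    """Normalized Frobenius distance between the two block histograms."""
    diff = sort_and_smooth(adj_a, n_blocks) - sort_and_smooth(adj_b, n_blocks)
    return np.linalg.norm(diff) / n_blocks


def sample_er_graph(n, p, rng):
    """Sample a simple undirected Erdos-Renyi graph (hypothetical test data)."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper + upper.T).astype(float)


rng = np.random.default_rng(0)
g_sparse_small = sample_er_graph(100, 0.1, rng)
g_sparse_large = sample_er_graph(150, 0.1, rng)
g_dense = sample_er_graph(120, 0.5, rng)

d_same_model = graphon_distance(g_sparse_small, g_sparse_large)
d_diff_model = graphon_distance(g_sparse_small, g_dense)
```

In this toy setting the distance between the two sparse graphs should come out smaller than the distance between a sparse and a dense graph, even though none of the graphs share a vertex set. The resulting pairwise distances could then be passed to any distance-based clustering routine.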

artificial intelligence, graph, machine learning, (18 more...)

arXiv.org Machine Learning

2110.02722

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.67)
Health & Medicine > Pharmaceuticals & Biotechnology (0.48)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)

A Multilayer Correlated Topic Model

Tian, Ye

arXiv.org Machine Learning | Jan-2-2021

We proposed a novel multilayer correlated topic model (MCTM) to analyze how the main ideas inherit and vary between a document and its different segments, which helps understand an article's structure. The variational expectation-maximization (EM) algorithm was derived to estimate the posterior and parameters in MCTM. We introduced two potential applications of MCTM, including the paragraph-level document analysis and market basket data analysis. The effectiveness of MCTM in understanding the document structure has been verified by the great predictive performance on held-out documents and intuitive visualization. We also showed that MCTM could successfully capture customers' popular shopping patterns in the market basket analysis.

machine learning, natural language, topic model, (19 more...)

arXiv.org Machine Learning

2101.02028

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)
North America > Canada (0.04)

Genre: Research Report (0.82)

Industry: Consumer Products & Services > Personal Products (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.75)

On clustering network-valued data

Mukherjee, Soumendu Sundar, Sarkar, Purnamrita, Lin, Lizhen

Neural Information Processing Systems | Feb-14-2020, 19:43:21 GMT

Community detection, which focuses on clustering nodes or detecting communities in (mostly) a single network, is a problem of considerable practical interest and has received a great deal of attention in the research community. While being able to cluster within a network is important, there are emerging needs to be able to cluster multiple networks. This is largely motivated by the routine collection of network data that are generated from potentially different populations. These networks may or may not have node correspondence. When node correspondence is present, we cluster networks by summarizing a network by its graphon estimate, whereas when node correspondence is not present, we propose a novel solution for clustering such networks by associating a computationally feasible feature vector to each network based on trace of powers of the adjacency matrix.

data mining, network-valued data, node correspondence, (2 more...)

Neural Information Processing Systems

Genre: Research Report (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.90)
Information Technology > Data Science > Data Mining (0.65)

On clustering network-valued data

Mukherjee, Soumendu Sundar, Sarkar, Purnamrita, Lin, Lizhen

Neural Information Processing Systems | Dec-31-2017

Community detection, which focuses on clustering nodes or detecting communities in (mostly) a single network, is a problem of considerable practical interest and has received a great deal of attention in the research community. While being able to cluster within a network is important, there are emerging needs to be able to cluster multiple networks. This is largely motivated by the routine collection of network data that are generated from potentially different populations. These networks may or may not have node correspondence. When node correspondence is present, we cluster networks by summarizing a network by its graphon estimate, whereas when node correspondence is not present, we propose a novel solution for clustering such networks by associating a computationally feasible feature vector to each network based on trace of powers of the adjacency matrix. We illustrate our methods using both simulated and real data sets, and theoretical justifications are provided in terms of consistency.

artificial intelligence, data mining, machine learning, (16 more...)

Neural Information Processing Systems

Country: